Add Oriented Bounding Box (OBB) support for rotated object detection #921
farukalamai wants to merge 15 commits into roboflow:develop
Codecov Report
❌ Patch coverage is …
Coverage Diff
| | develop | #921 | +/- |
|---|---|---|---|
| Coverage | 79% | 80% | +1% |
| Files | 97 | 99 | +2 |
| Lines | 7793 | 8061 | +268 |
| Hits | 6148 | 6410 | +262 |
| Misses | 1645 | 1651 | +6 |
Pull request overview
Adds Oriented Bounding Box (OBB) support across the RF-DETR training/inference stack to enable rotated object detection using [cx, cy, w, h, angle] boxes.
Changes:
- Introduces rotated box math utilities (corner conversion + Gaussian-based GWD/KLD/ProbIoU).
- Wires an `oriented` flag through config/namespace/model, adding an angle prediction head and oriented matching/loss/postprocess paths.
- Adds a DOTA v1.0 dataset loader and a broad new unit test suite for OBB components.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `src/rfdetr/utilities/rotated_box_ops.py` | Adds rotated box conversions and Gaussian-based similarity/loss utilities. |
| `src/rfdetr/datasets/dota_detection.py` | New DOTA dataset loader + transforms/normalization for OBB training. |
| `src/rfdetr/models/matcher.py` | Adds oriented matching path using GWD cost and `boxes_obb`. |
| `src/rfdetr/models/criterion.py` | Adds oriented loss path using KLD and ProbIoU for IoU-aware classification. |
| `src/rfdetr/models/postprocess.py` | Adds oriented postprocessing output (`boxes_obb` + `corners`). |
| `src/rfdetr/models/lwdetr.py` | Adds angle head + forward/export plumbing + passes `oriented` into postprocess. |
| `src/rfdetr/models/heads/detection.py` | Adds optional angle prediction in the standalone detection head. |
| `src/rfdetr/models/_types.py` | Extends builder args protocol to include `oriented`. |
| `src/rfdetr/datasets/__init__.py` | Registers DOTA dataset builder. |
| `src/rfdetr/config.py` | Adds `ModelConfig.oriented` and `TrainConfig.dataset_file="dota"`. |
| `src/rfdetr/_namespace.py` | Forwards `oriented` into the legacy builder namespace. |
| `pyproject.toml` | Adds codespell ignore for "dota" and mypy override for new dataset module. |
| `tests/utilities/test_rotated_box_ops.py` | Unit tests for rotated ops and losses. |
| `tests/datasets/test_dota_detection.py` | Unit tests for DOTA parsing/dataset behavior. |
| `tests/models/test_obb_*` | Tests for oriented head, matcher/criterion, postprocess, and export. |
| `tests/training/test_obb_integration.py` | Integration tests for config/namespace wiring. |
    if image_set == "train":
        resize_wrappers = AlbumentationsWrapper.from_config(
            [
                {"Resize": {"height": resolution, "width": resolution}},
            ]
        )
        aug_wrappers = AlbumentationsWrapper.from_config(
            [
                {"HorizontalFlip": {"p": 0.5}},
                {"VerticalFlip": {"p": 0.5}},
                {"RandomRotate90": {"p": 0.5}},
            ]
        )
        return Compose([*resize_wrappers, *aug_wrappers, to_image, to_float, normalize])
make_dota_transforms() builds a pipeline out of AlbumentationsWrapper geometric transforms, but AlbumentationsWrapper only transforms targets when the target dict contains a "boxes" key (see rfdetr/datasets/transforms.py:642-645). DOTA targets only provide "corners" / "boxes_obb", so resize/flip/rotate will modify the image while leaving the geometry unchanged, producing incorrect boxes_obb after DotaNormalize (and potentially silently training on wrong labels). Consider either (a) extending AlbumentationsWrapper (or a DOTA-specific wrapper) to transform the 4 corner keypoints via Albumentations keypoint_params, or (b) avoiding Albumentations geometric transforms here and implementing corner-aware transforms (including consistent filtering when boxes become invalid).
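Option (b) above, corner-aware transforms, can be illustrated with a minimal sketch. The helper names (`resize_corners`, `hflip_corners`, `vflip_corners`) are hypothetical and not part of the PR or of Albumentations; the sketch assumes corners are stored as four `(x, y)` points in continuous pixel coordinates, so a horizontal flip maps `x` to `width - x`.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Corners = List[Point]  # four (x, y) points in pixel coordinates (assumed layout)


def resize_corners(corners: Corners, src_wh: Tuple[int, int], dst_wh: Tuple[int, int]) -> Corners:
    """Scale corner points by the same factors used to resize the image."""
    sx = dst_wh[0] / src_wh[0]
    sy = dst_wh[1] / src_wh[1]
    return [(x * sx, y * sy) for x, y in corners]


def hflip_corners(corners: Corners, width: int) -> Corners:
    """Mirror corner points about the vertical centre line of the image."""
    return [(width - x, y) for x, y in corners]


def vflip_corners(corners: Corners, height: int) -> Corners:
    """Mirror corner points about the horizontal centre line of the image."""
    return [(x, height - y) for x, y in corners]


if __name__ == "__main__":
    box = [(10.0, 20.0), (30.0, 20.0), (30.0, 40.0), (10.0, 40.0)]
    print(resize_corners(box, (100, 50), (200, 100)))
    print(hflip_corners(box, 100))
```

Whatever mechanism is chosen, the point is that the same geometric operation must be applied to the image and to the corners before `boxes_obb` is recomputed; flipping also reverses the winding order of the corners, which matters if downstream code assumes a fixed order.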
    results = [
        {"scores": sc, "labels": lb, "boxes_obb": ob, "corners": cn}
        for sc, lb, ob, cn in zip(scores, labels, obb, corners)
    ]
When oriented=True, PostProcess.forward() returns early with boxes_obb/corners and skips the existing mask gathering/resizing logic. If segmentation_head is enabled alongside oriented boxes, postprocessing will drop pred_masks entirely. Consider handling the out_masks branch in oriented mode as well (gather the same top-K masks and resize to target_sizes), and include them in the returned dicts.
Suggested change:
    - results = [
    -     {"scores": sc, "labels": lb, "boxes_obb": ob, "corners": cn}
    -     for sc, lb, ob, cn in zip(scores, labels, obb, corners)
    - ]
    + results = []
    + if out_masks is not None:
    +     for i in range(out_masks.shape[0]):
    +         res_i = {
    +             "scores": scores[i],
    +             "labels": labels[i],
    +             "boxes_obb": obb[i],
    +             "corners": corners[i],
    +         }
    +         k_idx = topk_boxes[i]
    +         masks_i = torch.gather(
    +             out_masks[i],
    +             0,
    +             k_idx.unsqueeze(-1).unsqueeze(-1).repeat(1, out_masks.shape[-2], out_masks.shape[-1]),
    +         )  # [K, Hm, Wm]
    +         h, w = target_sizes[i].tolist()
    +         masks_i = F.interpolate(
    +             masks_i.unsqueeze(1),
    +             size=(int(h), int(w)),
    +             mode="bilinear",
    +             align_corners=False,
    +         )  # [K,1,H,W]
    +         res_i["masks"] = masks_i > 0.0
    +         results.append(res_i)
    + else:
    +     results = [
    +         {"scores": sc, "labels": lb, "boxes_obb": ob, "corners": cn}
    +         for sc, lb, ob, cn in zip(scores, labels, obb, corners)
    +     ]
    if self.oriented:
        loss_bbox = F.l1_loss(src_boxes[..., :4], target_boxes[..., :4], reduction="none")
        losses["loss_bbox"] = loss_bbox.sum() / num_boxes
        loss_kld = kld_loss(src_boxes, target_boxes)
        losses["loss_giou"] = loss_kld.sum() / num_boxes
In oriented mode, loss_boxes() stores the KLD regression term under the key "loss_giou". This keeps the existing weight_dict wiring working, but it makes logs/metrics misleading ("giou" is no longer GIoU). Consider additionally emitting a correctly named key like "loss_kld" (or similar) for logging/monitoring while retaining "loss_giou" for backward compatibility, or add an inline comment clarifying the semantic change.
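One way to follow this suggestion is sketched below. `add_kld_alias` is a hypothetical helper, not code from the PR; it assumes the losses dict maps names to scalar values and that the training loop only sums keys present in `weight_dict`, so an extra logging-only key is not counted twice.

```python
from typing import Dict


def add_kld_alias(losses: Dict[str, float], oriented: bool) -> Dict[str, float]:
    """Return a copy of `losses` with a descriptive `loss_kld` entry for logging.

    `loss_giou` is kept so that existing weight_dict wiring keeps working.
    """
    out = dict(losses)
    if oriented and "loss_giou" in out:
        # Same value under a name that matches what is actually computed.
        out["loss_kld"] = out["loss_giou"]
    return out


if __name__ == "__main__":
    print(add_kld_alias({"loss_bbox": 0.5, "loss_giou": 1.25}, oriented=True))
```

If the alias were also added to `weight_dict`, the KLD term would be weighted twice, so under this approach it should stay a monitoring-only key.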
    class TestBuildDota:
        def test_builds_dataset(self, dota_root: Path) -> None:
            import types

            args = types.SimpleNamespace(dataset_dir=str(dota_root.parent))
            root_with_split = dota_root.parent / "train"
            root_with_split.mkdir(exist_ok=True)
            (root_with_split / "images").mkdir(exist_ok=True)
            (root_with_split / "labelTxt").mkdir(exist_ok=True)
            img = Image.new("RGB", (50, 50), color="blue")
            img.save(root_with_split / "images" / "test.png")
            (root_with_split / "labelTxt" / "test.txt").write_text("")
            args.dataset_dir = str(dota_root.parent)
            dataset = build_dota("train", args, 256)
            assert isinstance(dataset, DotaDetection)
TestBuildDota.test_builds_dataset only asserts that build_dota() returns a DotaDetection instance, but it never calls dataset[0]. Since build_dota() wires up a transform pipeline, this test currently won’t catch transform/target-sync issues (e.g., geometric transforms not updating corners / boxes_obb). Consider adding a __getitem__ assertion that verifies shapes and that boxes_obb stays in normalized [0,1] coords after transforms.
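The extra assertion could take roughly the following shape. `check_normalized_obb` is a hypothetical helper; it assumes `boxes_obb` can be read as rows of five numbers `[cx, cy, w, h, angle]`, with the first four normalized to [0, 1] and the angle left in radians. In the real test it would be applied to the target returned by `dataset[0]`, for example after converting a tensor with `.tolist()`.

```python
import math
from typing import Sequence


def check_normalized_obb(boxes_obb: Sequence[Sequence[float]]) -> None:
    """Assert that every box has 5 values and normalized spatial components."""
    for box in boxes_obb:
        assert len(box) == 5, f"expected [cx, cy, w, h, angle], got {len(box)} values"
        cx, cy, w, h, angle = box
        for name, value in (("cx", cx), ("cy", cy), ("w", w), ("h", h)):
            assert 0.0 <= value <= 1.0, f"{name}={value} is outside [0, 1]"
        # The angle is not normalized; only require a finite number.
        assert math.isfinite(angle), f"angle={angle} is not finite"


if __name__ == "__main__":
    check_normalized_obb([[0.5, 0.5, 0.2, 0.1, 0.3]])
    print("ok")
```

A check like this only catches labels that leave the valid range; verifying that the geometry actually follows the image through a flip or rotation needs a fixture with a known box and a deterministic transform.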
What does this PR do?
Adds OBB (Oriented Bounding Box) support to RF-DETR, enabling rotated object detection with `[cx, cy, w, h, angle]` box format. This includes rotated box math utilities, DOTA v1.0 dataset loader, angle prediction head, Gaussian-based matching/loss functions, and oriented postprocessing.
When `oriented=True` is set in `ModelConfig`, the model predicts a 5th dimension (rotation angle in radians) per box, uses KLD loss for regression, GWD cost for Hungarian matching, and outputs rotated corner points.
Related Issue(s): Fixes #56
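To make the `[cx, cy, w, h, angle]` format concrete, here is a minimal sketch of the two conversions that Gaussian-based OBB methods rely on: box to corner points, and box to a 2D Gaussian (mean and covariance). The function names are illustrative and are not taken from `rotated_box_ops.py`. The covariance scaling `R diag(w²/4, h²/4) Rᵀ` follows the convention commonly used with GWD and is an assumption here; the PR may use a different constant. The angle is taken in radians, measured from the x-axis towards the y-axis.

```python
import math
from typing import List, Tuple

Matrix2 = Tuple[Tuple[float, float], Tuple[float, float]]


def obb_to_corners(cx: float, cy: float, w: float, h: float, angle: float) -> List[Tuple[float, float]]:
    """Rotate the four half-extent offsets by `angle` and translate to the centre."""
    c, s = math.cos(angle), math.sin(angle)
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


def obb_to_gaussian(cx: float, cy: float, w: float, h: float, angle: float) -> Tuple[Tuple[float, float], Matrix2]:
    """Return (mean, covariance) with covariance = R diag(w^2/4, h^2/4) R^T (assumed scaling)."""
    c, s = math.cos(angle), math.sin(angle)
    a, b = w * w / 4.0, h * h / 4.0
    sxx = a * c * c + b * s * s
    syy = a * s * s + b * c * c
    sxy = (a - b) * c * s
    return (cx, cy), ((sxx, sxy), (sxy, syy))


if __name__ == "__main__":
    print(obb_to_corners(10.0, 20.0, 4.0, 2.0, 0.0))
    print(obb_to_gaussian(10.0, 20.0, 4.0, 2.0, math.pi / 6))
```

A useful property of the Gaussian form is that `(w, h, angle)` and `(h, w, angle + π/2)` describe the same rectangle and map to the same covariance, so losses defined on the Gaussians are not affected by that ambiguity in the angle parameterization.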
Type of Change
Testing
Test details:
75 new tests covering all OBB components:
- `test_rotated_box_ops.py` (34 tests): angle normalization, box conversions, roundtrips, GWD/KLD/ProbIoU losses, gradient flow, edge cases (zero-size boxes, large boxes)
- `test_dota_detection.py` (21 tests): annotation parsing, dataset loading, filtering, normalization, empty/missing files
- `test_obb_head.py` (8 tests): detection head output shapes, angle range, gradient flow
- `test_obb_matcher_criterion.py` (6 tests): oriented matching, KLD loss computation
- `test_obb_postprocess.py` (5 tests): output keys, shapes, scaling, batch support
- `test_obb_export.py` (2 tests): ONNX export produces 5D output with valid angles
- `test_obb_integration.py` (5 tests): config flag, namespace forwarding, dataset_file acceptance
All existing tests pass unchanged.
Checklist
Additional Context
Key design decisions:
- `angle_embed` MLP rather than extending `bbox_embed` to 5D: keeps spatial box prediction unchanged
- `box_iou` when oriented
- `oriented: bool = False` on `ModelConfig`: zero impact on existing non-oriented models
Files changed
New files:
- `src/rfdetr/utilities/rotated_box_ops.py`: box conversions, Gaussian encoding, GWD/KLD/ProbIoU
- `src/rfdetr/datasets/dota_detection.py`: DOTA v1.0 dataset loader with annotation parser
- `tests/`: 7 new test files (75 tests)
Modified files:
- `src/rfdetr/config.py`: added `oriented` flag and `"dota"` dataset option
- `src/rfdetr/_namespace.py`: forward `oriented` to namespace
- `src/rfdetr/models/_types.py`: added `oriented` to `BuilderArgs` protocol
- `src/rfdetr/models/heads/detection.py`: added `angle_embed` MLP when oriented
- `src/rfdetr/models/lwdetr.py`: angle prediction in forward pass, zero-init, builder wiring
- `src/rfdetr/models/matcher.py`: GWD pairwise cost for oriented matching
- `src/rfdetr/models/criterion.py`: KLD loss for oriented boxes, ProbIoU for `ia_bce_loss`
- `src/rfdetr/models/postprocess.py`: oriented output with `boxes_obb` and `corners`
- `src/rfdetr/datasets/__init__.py`: registered DOTA dataset builder